Here is my solution:
from dataclasses import dataclass
from faker import Faker
import random
import itertools
import operator


@dataclass
class Person:
    name: str
    age: int
    city: str
    country: str


# Instantiate the Faker module
fake = Faker()

# List of possible countries
countries = [
    "UK",
    "USA",
    "Japan",
    "Australia",
    "France",
    "Germany",
    "Italy",
    "Spain",
    "Canada",
    "Mexico",
]

# Generate 1000 random Person instances
PERSON_DATA: list[Person] = [
    Person(fake.name(), random.randint(18, 70), fake.city(), random.choice(countries))
    for _ in range(1000)
]


def is_older_than(age: int, threshold: int = 21):
    return age >= threshold


def main() -> None:
    filtered_data: list[Person] = []
    for person in PERSON_DATA:
        if person.age >= 21:
            filtered_data.append(person)

    # filter persons by age
    filtered_data2 = list(itertools.filterfalse(lambda x: not is_older_than(x.age), PERSON_DATA))
    filtered_data2.sort(key=operator.attrgetter("country"))
    an_iterator = itertools.groupby(filtered_data2, key=lambda p: p.country)
    summary = {country: len(list(group)) for country, group in an_iterator}
    print(f"Summary: {summary}")


if __name__ == "__main__":
    main()
Nice solution, Manuel!
There are some remarks I would like to make:
* Uppercase names are conventionally used for constants, not for calculated values; keep the name lowercase if the value is calculated
* Wrapping the result in `list` is only needed because of the in-place `.sort()` on the next line; `filterfalse` returns an iterator, which you can pass straight to `sorted(...)` instead
* The type annotation on `PERSON_DATA` is not needed, since it is inferred from the list itself. It is not wrong, though, so keep it if you want to. However, then I would argue that variables need constants then.
Thanks, Andreas, for your suggestions. I will keep them in mind.
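A quick way to check what `filterfalse` hands back, and how `sorted` can consume it without an intermediate `list`. This is only a sketch: the trimmed-down `Person` class and the three sample people are made up for illustration and are not the challenge data.

```python
import itertools
import operator
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int
    country: str


# Small hand-written sample (hypothetical data, not from the challenge).
people = [
    Person("Ann", 30, "UK"),
    Person("Bob", 19, "USA"),
    Person("Cy", 45, "Japan"),
]

adults = itertools.filterfalse(lambda p: p.age < 21, people)
print(isinstance(adults, list))  # False: filterfalse returns a lazy iterator

# sorted() accepts any iterable and always returns a new list.
adults_by_country = sorted(adults, key=operator.attrgetter("country"))
print([p.country for p in adults_by_country])  # ['Japan', 'UK']
```

The sorted list can then go straight into `itertools.groupby`.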
I used Counter from the collections library:
import collections as c
import itertools as it

def main()....
    filtered_data = list(it.filterfalse(lambda person: person.age < 20, PERSON_DATA))
    summary: dict[str, int] = {}
    summary = dict(c.Counter(person.country for person in filtered_data))
Nice solution! Some minor improvements can be made. First, import the `Counter` object directly instead of importing the whole library. Second, we do not need to create a summary variable with the type annotation before the Counter call. That way, we also do not need to set the type annotation, since it can be inferred.
Why does groupby not work without sorting the data first?
Hi Roberto, groupby groups *consecutive items* from an iterable that have the same key value. This is why the iterable should be sorted by the key before using `groupby`, otherwise items with the same key that are not consecutive won't be grouped together.
Let's say you have the following list of numbers and you want to group them by their value:
numbers = [1, 2, 2, 1, 3, 2]
If you were to use `groupby` directly on this list, you would get:
1: [1]
2: [2, 2]
1: [1]
3: [3]
2: [2]
This is because groupby simply groups consecutive items with the same key. The number 1 appears in two separate groups and the number 2 appears in two separate groups as well.
If you first sort the list (so the list becomes [1, 1, 2, 2, 2, 3]), then using groupby will result in this:
1: [1, 1]
2: [2, 2, 2]
3: [3]
In the challenge code, sorting ensures that all persons from the same country are adjacent to each other, so groupby can group them into a single group. Hope that clarifies it!
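The behaviour described above can be reproduced with a few lines. This sketch uses the same `numbers` list and the default key (the value itself); the variable names `unsorted_groups` and `sorted_groups` are just for illustration.

```python
from itertools import groupby

numbers = [1, 2, 2, 1, 3, 2]

# Unsorted input: equal values that are not adjacent land in separate groups.
unsorted_groups = [(key, list(group)) for key, group in groupby(numbers)]
print(unsorted_groups)
# [(1, [1]), (2, [2, 2]), (1, [1]), (3, [3]), (2, [2])]

# Sorted input: every value forms exactly one group.
sorted_groups = [(key, list(group)) for key, group in groupby(sorted(numbers))]
print(sorted_groups)
# [(1, [1, 1]), (2, [2, 2, 2]), (3, [3])]
```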
Thanks a lot! I thought it worked like pandas groupby.
You're welcome!
Great problem!
Looking at a couple of different approaches (filterfalse vs generator expressions, groupby vs defaultdict vs Counter) was a nice refresher on the cost of sorting lists:
timeit(filterfalse_groupby, number=1000)      # 2.442
timeit(filterfalse_defaultdict, number=1000)  # 1.301
timeit(filterfalse_counter, number=1000)      # 1.280
timeit(generator_groupby, number=1000)        # 2.255
timeit(generator_defaultdict, number=1000)    # 1.146
timeit(generator_counter, number=1000)        # 1.145
Boosting the Persons list from range(10_000) to range(100_000):
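filterfalse_groupby: 50.180
filterfalse_defaultdict: 21.197
filterfalse_counter: 21.496
generator_groupby: 47.090
generator_defaultdict: 15.741
generator_counter: 15.362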
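For anyone who wants to rerun a comparison like this, here is a rough sketch of two of the variants. The function names are taken from the post, but their bodies were not shown, so the implementations below are guesses. The data is generated with `random` instead of Faker to keep the example dependency-free, and the list size and `number=100` are arbitrary choices.

```python
import random
from collections import Counter
from dataclasses import dataclass
from itertools import filterfalse, groupby
from operator import attrgetter
from timeit import timeit


@dataclass
class Person:
    age: int
    country: str


# Hypothetical stand-in for the challenge data (no names or cities needed here).
COUNTRIES = ["UK", "USA", "Japan", "France", "Germany"]
random.seed(0)
people = [Person(random.randint(18, 70), random.choice(COUNTRIES)) for _ in range(1000)]


def filterfalse_groupby() -> dict[str, int]:
    # groupby needs sorted input, so this variant pays for a sort on every call.
    adults = sorted(filterfalse(lambda p: p.age < 21, people), key=attrgetter("country"))
    return {
        country: len(list(group))
        for country, group in groupby(adults, key=attrgetter("country"))
    }


def generator_counter() -> dict[str, int]:
    # Counter tallies in a single pass and does not need sorted input.
    return dict(Counter(p.country for p in people if p.age >= 21))


# Both variants must agree before their timings are worth comparing.
assert filterfalse_groupby() == generator_counter()

print(f"filterfalse_groupby: {timeit(filterfalse_groupby, number=100):.3f}")
print(f"generator_counter:   {timeit(generator_counter, number=100):.3f}")
```

Absolute timings depend on the machine and Python version, so only the relative difference between the two lines is meaningful.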
Thanks for posting those numbers - definitely a big cost difference between each!
You could also combine itertools with collections.Counter to write it in two lines:
filtered_data = itertools.filterfalse(lambda person: person.age < 21, PERSON_DATA)
summary = Counter(person.country for person in filtered_data)
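To try the two lines on their own, they need the imports and some data. A minimal sketch, assuming a trimmed-down `Person` class and four made-up people in place of the Faker-generated `PERSON_DATA`:

```python
import itertools
from collections import Counter
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int
    country: str


# Hypothetical sample data standing in for the generated PERSON_DATA.
PERSON_DATA = [
    Person("Ann", 30, "UK"),
    Person("Bob", 19, "USA"),
    Person("Cy", 45, "UK"),
    Person("Di", 21, "Japan"),
]

# filterfalse drops everyone for whom the predicate is true (younger than 21).
filtered_data = itertools.filterfalse(lambda person: person.age < 21, PERSON_DATA)
# Counter consumes the iterator directly; no sorting or list() is required.
summary = Counter(person.country for person in filtered_data)
print(summary)  # Counter({'UK': 2, 'Japan': 1})
```

Since `Counter` is a `dict` subclass, `summary` can be used anywhere a plain dictionary is expected.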